Grapheme Segmentation of Tamil Speech Signals using Excitation Information with MFCC and LPCC Features
Authors
Abstract
The major components of an automatic speech recognition (ASR) system are the pronunciation dictionary, the language model, the acoustic model and the decoder. Pronunciation dictionaries define the mapping between the words and the basic sounds of a language and thus play a vital role in speech recognition systems. Constructing a pronunciation dictionary is expensive and time-consuming because it requires knowledge of the target language. Because phonemes are context dependent, a unit larger than the phoneme is considered, with the aim of developing a sophisticated tool for speech-to-text applications. The proposed segmentation algorithm is tested on continuous speech read by four native speakers. Energy and spectral centroid features are used to remove silent portions, and the vowel onset point is used as the anchor point for finding the beginning of the vowel. The proposed method is analysed at various time tolerances and the results are presented.
Keywords: Speech Segmentation, Grapheme Segmentation, MFCC, LPCC.
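The silence-removal step mentioned in the abstract (energy and spectral centroid features) can be illustrated with a minimal sketch. This is not the authors' implementation: the frame sizes, the threshold rule, and the names `frame_signal`, `speech_mask`, `energy_ratio` and `min_centroid_hz` are assumptions chosen for illustration only.

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (one frame per row)."""
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def short_time_energy(frames):
    """Mean squared amplitude of each frame."""
    return np.mean(frames ** 2, axis=1)


def spectral_centroid(frames, sr):
    """Magnitude-weighted mean frequency (Hz) of each frame."""
    window = np.hamming(frames.shape[1])
    mag = np.abs(np.fft.rfft(frames * window, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    # Small constant avoids division by zero on all-zero frames.
    return (mag @ freqs) / (mag.sum(axis=1) + 1e-12)


def speech_mask(x, sr, frame_ms=25, hop_ms=10,
                energy_ratio=0.05, min_centroid_hz=100.0):
    """Return a boolean mask per frame: True = keep (non-silent).

    Assumed rule (hypothetical, not taken from the paper): a frame is
    kept when its energy exceeds a fraction of the maximum frame energy
    AND its spectral centroid exceeds a fixed floor in Hz.
    """
    frame_len = int(sr * frame_ms / 1000)
    hop = int(sr * hop_ms / 1000)
    frames = frame_signal(x, frame_len, hop)
    energy = short_time_energy(frames)
    centroid = spectral_centroid(frames, sr)
    return (energy > energy_ratio * energy.max()) & (centroid > min_centroid_hz)
```

With a 16 kHz signal, `speech_mask(x, 16000)` yields one boolean per 10 ms hop. Frames marked False would be dropped before looking for vowel onset points. In practice the thresholds would need tuning per recording condition.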
Similar resources
Recognition of Tamil Syllables Using Vowel Onset Points with Production, Perception Based Features
Tamil is one of the ancient Dravidian languages spoken in South India. Most Indian languages are syllabic in nature, and their syllables take the form of Consonant-Vowel (CV) units. In Tamil, the CV pattern occurs at the beginning, middle and end of a word. In this work, CV units formed from a Stop Consonant and a Short Vowel (SCSV) were considered for the classification task. The work ca...
Automatic continuous speech recogniser for Dravidian languages using the auto associative neural network
In recent times, with the extensive improvement of computers, numerous methods of data interchange between humans and computers have emerged. The aim is to provide an efficient way for humans to communicate with computers, especially for people with disabilities who face a variety of obstacles when using computers. This paper focuses predominantly on developing an efficient speech recognition system fo...
Speech recognition of mandarin syllables using both linear predict coding cepstra and Mel frequency cepstra
This paper compares the two most common features used to represent a spoken word for speech recognition on the basis of accuracy, computation time, complexity and cost. The two features are the linear predict coding cepstra (LPCC) and the Mel-frequency cepstrum coefficient (MFCC). The MFCC was shown to be more accurate than the LPCC in speech recognition using the dynamic...
Automatic Speaker Recognition using LPCC and MFCC
A person's voice contains various parameters that convey information such as emotion, gender, attitude, health and identity. This report talks about speaker recognition which deals with the subject of identifying a person based on their unique voiceprint present in their speech data. Pre-processing of the speech signal is performed before voice feature extraction. This process ensures the voice...
Integrating Complementary Features from Vocal Source and Vocal Tract for Speaker Identification
This paper describes a speaker identification system that uses complementary acoustic features derived from the vocal source excitation and the vocal tract system. Conventional speaker recognition systems typically adopt the cepstral coefficients, e.g., Mel-frequency cepstral coefficients (MFCC) and linear predictive cepstral coefficients (LPCC), as the representative features. The cepstral fea...
Publication date: 2017